Overview

Dataset statistics

Number of variables: 17
Number of observations: 46428
Missing cells: 18400
Missing cells (%): 2.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.0 MiB
Average record size in memory: 136.0 B
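
The missing-cell percentage can be cross-checked from the other figures: it is the number of missing cells divided by (variables × observations). The sketch below uses only values reported in this document; the per-variable missing counts are taken from the name, host_name, last_review and reviews_per_month sections.

```python
# Cross-check of "Missing cells (%)" using values reported in this document.
n_variables = 17
n_observations = 46428

# Per-variable missing counts as listed in the variable sections of the report.
missing_by_variable = {
    "name": 15,
    "host_name": 21,
    "last_review": 9182,
    "reviews_per_month": 9182,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_variable.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # total number of missing cells
print(f"{missing_pct:.1f}%")  # share of all cells that are missing
```

The four listed variables account for every missing cell in the dataset.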

Variable types

Numeric: 11
Categorical: 5
DateTime: 1

Warnings

name has a high cardinality: 45489 distinct values [High cardinality]
host_name has a high cardinality: 11081 distinct values [High cardinality]
neighbourhood has a high cardinality: 219 distinct values [High cardinality]
df_index is highly correlated with id [High correlation]
id is highly correlated with df_index [High correlation]
last_review has 9182 (19.8%) missing values [Missing]
reviews_per_month has 9182 (19.8%) missing values [Missing]
minimum_nights is highly skewed (γ1 = 21.79076237) [Skewed]
name is uniformly distributed [Uniform]
df_index has unique values [Unique]
id has unique values [Unique]
number_of_reviews has 9182 (19.8%) zeros [Zeros]
availability_365 has 17005 (36.6%) zeros [Zeros]
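
The γ1 in the skewness warning is a sample skewness. As a rough illustration, the sketch below computes the adjusted Fisher-Pearson skewness, the bias-corrected estimator that pandas implements in `Series.skew()`. Whether the report uses exactly this estimator is an assumption, the `threshold` value is hypothetical, and the data is made up.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (bias-corrected); needs at least 3 values."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5

def is_highly_skewed(values, threshold=20.0):
    # `threshold` is a hypothetical cut-off chosen for this sketch.
    return abs(sample_skewness(values)) > threshold

# Hypothetical data: mostly short stays with one extreme outlier.
nights = [1, 1, 2, 2, 3, 5, 30, 1250]
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(symmetric))   # symmetric data has zero skewness
print(sample_skewness(nights) > 0)  # a long right tail gives positive skewness
```

A large positive value, as reported for minimum_nights, indicates a long right tail: most listings have small minimums while a few have very large ones.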

Reproduction

Analysis started: 2023-04-27 07:28:23.935200
Analysis finished: 2023-04-27 07:28:39.757893
Duration: 15.82 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
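
The reported duration is simply the difference between the two timestamps. A minimal check with the standard library, using the values shown above:

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-04-27 07:28:23.935200", fmt)
finished = datetime.strptime("2023-04-27 07:28:39.757893", fmt)

# Elapsed wall-clock time of the analysis, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```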

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 46428
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24313.23591
Minimum: 0
Maximum: 48894
Zeros: 1
Zeros (%): < 0.1%
Memory size: 362.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2428.35
Q1: 12176.75
Median: 24264.5
Q3: 36383.25
95-th percentile: 46373.3
Maximum: 48894
Range: 48894
Interquartile range (IQR): 24206.5
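
Fractional percentiles such as 2428.35 arise from interpolating between neighbouring order statistics. The sketch below shows linear interpolation, which is the default of pandas' `quantile`; that the report relies on this default is an assumption. It uses a small hypothetical sample, then applies the range and IQR definitions to the values reported for df_index.

```python
def quantile_linear(sorted_values, q):
    """Quantile by linear interpolation between order statistics (0 <= q <= 1)."""
    if not sorted_values:
        raise ValueError("empty sample")
    pos = (len(sorted_values) - 1) * q   # fractional index into the sorted sample
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

# Hypothetical sample, already sorted.
sample = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

q1 = quantile_linear(sample, 0.25)
q3 = quantile_linear(sample, 0.75)
value_range = sample[-1] - sample[0]   # Range = Maximum - Minimum
iqr = q3 - q1                          # IQR = Q3 - Q1

# The same definitions applied to the values reported for df_index.
reported_iqr = 36383.25 - 12176.75
reported_range = 48894 - 0
print(reported_iqr, reported_range)
```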

Descriptive statistics

Standard deviation: 14042.69692
Coefficient of variation (CV): 0.5775741643
Kurtosis: -1.188349436
Mean: 24313.23591
Median Absolute Deviation (MAD): 12103.5
Skewness: 0.009876485015
Sum: 1128814917
Variance: 197197336.6
Monotonicity: Strictly increasing
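
Two of these figures are derived from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. A quick consistency check on the df_index values, with tolerances that allow for the rounding in the report:

```python
import math

# Values reported for df_index.
mean = 24313.23591
std = 14042.69692
reported_cv = 0.5775741643
reported_variance = 197197336.6

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance as the squared standard deviation

print(math.isclose(cv, reported_cv, abs_tol=1e-6))
print(math.isclose(variance, reported_variance, rel_tol=1e-6))
```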
Histogram with fixed size bins (bins=50)

Common values

Value                   Count   Frequency (%)
0                       1       < 0.1%
39558                   1       < 0.1%
10896                   1       < 0.1%
8849                    1       < 0.1%
14994                   1       < 0.1%
12947                   1       < 0.1%
2708                    1       < 0.1%
6806                    1       < 0.1%
4759                    1       < 0.1%
27288                   1       < 0.1%
Other values (46418)    46418   > 99.9%

Minimum 5 values

Value    Count   Frequency (%)
0        1       < 0.1%
1        1       < 0.1%
2        1       < 0.1%
3        1       < 0.1%
4        1       < 0.1%

Maximum 5 values

Value    Count   Frequency (%)
48894    1       < 0.1%
48893    1       < 0.1%
48892    1       < 0.1%
48891    1       < 0.1%
48890    1       < 0.1%

id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 46428
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18918078.01
Minimum: 2539
Maximum: 36487245
Zeros: 0
Zeros (%): 0.0%
Memory size: 362.8 KiB

Quantile statistics

Minimum: 2539
5-th percentile: 1209898.9
Q1: 9445461.25
Median: 19544622
Q3: 28937773.5
95-th percentile: 35225452.3
Maximum: 36487245
Range: 36484706
Interquartile range (IQR): 19492312.25

Descriptive statistics

Standard deviation: 10931202.2
Coefficient of variation (CV): 0.5778177992
Kurtosis: -1.219101857
Mean: 18918078.01
Median Absolute Deviation (MAD): 9799792.5
Skewness: -0.08074382881
Sum: 8.78328526 × 10^11
Variance: 1.194911816 × 10^14
Monotonicity: Strictly increasing
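
Both df_index and id are reported as strictly increasing, which is enough to explain the high-correlation warning between them: two strictly increasing columns have identical ranks, so their rank correlation is exactly 1. The sketch below uses a hand-written Spearman coefficient (assuming no ties) on the five smallest values of each column as listed in this report; because both columns increase with row order, these belong to the same five rows. Which correlation measure actually triggered the warning is not stated here, so the choice of Spearman is an assumption.

```python
def ranks(values):
    """Rank of each value (1 = smallest); assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

df_index = [0, 1, 2, 3, 4]
listing_id = [2539, 2595, 3647, 3831, 5022]
rho = spearman(df_index, listing_id)
print(rho)
```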
Histogram with fixed size bins (bins=50)

Common values

Value                   Count   Frequency (%)
32493112                1       < 0.1%
19274358                1       < 0.1%
20015736                1       < 0.1%
4809337                 1       < 0.1%
16929407                1       < 0.1%
5941888                 1       < 0.1%
31337900                1       < 0.1%
15970947                1       < 0.1%
22929141                1       < 0.1%
28842166                1       < 0.1%
Other values (46418)    46418   > 99.9%

Minimum 5 values

Value    Count   Frequency (%)
2539     1       < 0.1%
2595     1       < 0.1%
3647     1       < 0.1%
3831     1       < 0.1%
5022     1       < 0.1%

Maximum 5 values

Value       Count   Frequency (%)
36487245    1       < 0.1%
36485609    1       < 0.1%
36485431    1       < 0.1%
36485057    1       < 0.1%
36484665    1       < 0.1%

name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 45489
Distinct (%): 98.0%
Missing: 15
Missing (%): < 0.1%
Memory size: 362.8 KiB

Hillside Hotel: 18
Home away from home: 17
New york Multi-unit building: 13
Brooklyn Apartment: 12
Loft Suite @ The Box House Hotel: 11
Other values (45484): 46342

Length

Max length: 179
Median length: 36
Mean length: 36.76734966
Min length: 1

Characters and Unicode

Total characters: 1706483
Distinct characters: 768
Distinct categories: 20
Distinct scripts: 11
Distinct blocks: 17
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
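
As a small illustration of analysing text by Unicode properties, the sketch below counts characters by general category with Python's standard `unicodedata` module, using one of the sample names from this report. The two-letter codes correspond to the category names used in the report's tables, for example `Lu` for Uppercase Letter, `Ll` for Lowercase Letter and `Zs` for Space Separator. The standard library does not expose script or block properties, so only categories are shown, and this is not necessarily how the report computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Lu', 'Ll', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("Skylit Midtown Castle")
print(counts)
```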

Unique

Unique: 44878
Unique (%): 96.7%

Sample

1st row: Clean & quiet apt home by the park
2nd row: Skylit Midtown Castle
3rd row: THE VILLAGE OF HARLEM....NEW YORK !
4th row: Cozy Entire Floor of Brownstone
5th row: Entire Apt: Spacious Studio/Loft by central park

Common values

Value                                         Count   Frequency (%)
Hillside Hotel                                18      < 0.1%
Home away from home                           17      < 0.1%
New york Multi-unit building                  13      < 0.1%
Brooklyn Apartment                            12      < 0.1%
Loft Suite @ The Box House Hotel              11      < 0.1%
Private Room                                  11      < 0.1%
Private room                                  10      < 0.1%
Artsy Private BR in Fort Greene Cumberland    10      < 0.1%
Private room in Brooklyn                      8       < 0.1%
Private room in Williamsburg                  8       < 0.1%
Other values (45479)                          46295   99.7%
(Missing)                                     15      < 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
in16203
 
5.7%
room9976
 
3.5%
7815
 
2.8%
bedroom7283
 
2.6%
private7022
 
2.5%
apartment6461
 
2.3%
cozy4946
 
1.8%
apt4410
 
1.6%
brooklyn3918
 
1.4%
studio3897
 
1.4%
Other values (11921)210490
74.5%

Most occurring characters

ValueCountFrequency (%)
237569
 
13.9%
e117726
 
6.9%
o116911
 
6.9%
t100052
 
5.9%
a98692
 
5.8%
r93300
 
5.5%
i90377
 
5.3%
n90086
 
5.3%
l48969
 
2.9%
m47334
 
2.8%
Other values (758)665467
39.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1146598
67.2%
Uppercase Letter251883
 
14.8%
Space Separator237573
 
13.9%
Other Punctuation31772
 
1.9%
Decimal Number22880
 
1.3%
Dash Punctuation6440
 
0.4%
Other Letter2536
 
0.1%
Math Symbol2412
 
0.1%
Close Punctuation1470
 
0.1%
Open Punctuation1331
 
0.1%
Other values (10)1588
 
0.1%

Most frequent character per category

ValueCountFrequency (%)
81
 
3.2%
46
 
1.8%
44
 
1.7%
41
 
1.6%
38
 
1.5%
36
 
1.4%
35
 
1.4%
35
 
1.4%
29
 
1.1%
29
 
1.1%
Other values (518)2122
83.7%
ValueCountFrequency (%)
e117726
 
10.3%
o116911
 
10.2%
t100052
 
8.7%
a98692
 
8.6%
r93300
 
8.1%
i90377
 
7.9%
n90086
 
7.9%
l48969
 
4.3%
m47334
 
4.1%
s45537
 
4.0%
Other values (57)297614
26.0%
ValueCountFrequency (%)
212
27.0%
155
19.8%
105
13.4%
37
 
4.7%
34
 
4.3%
30
 
3.8%
25
 
3.2%
15
 
1.9%
15
 
1.9%
14
 
1.8%
Other values (45)142
18.1%
ValueCountFrequency (%)
B28028
 
11.1%
S24719
 
9.8%
C19963
 
7.9%
A18361
 
7.3%
R16833
 
6.7%
P13884
 
5.5%
E13057
 
5.2%
L12914
 
5.1%
M11116
 
4.4%
N10812
 
4.3%
Other values (33)82196
32.6%
ValueCountFrequency (%)
,8742
27.5%
!7439
23.4%
/4768
15.0%
.4151
13.1%
&3043
 
9.6%
'1026
 
3.2%
*835
 
2.6%
:552
 
1.7%
#501
 
1.6%
"278
 
0.9%
Other values (11)437
 
1.4%
ValueCountFrequency (%)
+1223
50.7%
|858
35.6%
~247
 
10.2%
=32
 
1.3%
>19
 
0.8%
<19
 
0.8%
6
 
0.2%
4
 
0.2%
2
 
0.1%
×1
 
< 0.1%
ValueCountFrequency (%)
18327
36.4%
26019
26.3%
32159
 
9.4%
51919
 
8.4%
01870
 
8.2%
41080
 
4.7%
6503
 
2.2%
7408
 
1.8%
8366
 
1.6%
9229
 
1.0%
ValueCountFrequency (%)
(1278
96.0%
[35
 
2.6%
{8
 
0.6%
8
 
0.6%
2
 
0.2%
ValueCountFrequency (%)
)1416
96.3%
]36
 
2.4%
}8
 
0.5%
8
 
0.5%
2
 
0.1%
ValueCountFrequency (%)
-6374
99.0%
41
 
0.6%
24
 
0.4%
1
 
< 0.1%
ValueCountFrequency (%)
^9
56.2%
`4
25.0%
´3
 
18.8%
ValueCountFrequency (%)
21
56.8%
11
29.7%
5
 
13.5%
ValueCountFrequency (%)
237569
> 99.9%
 4
 
< 0.1%
ValueCountFrequency (%)
188
83.6%
37
 
16.4%
ValueCountFrequency (%)
150
91.5%
14
 
8.5%
ValueCountFrequency (%)
_42
97.7%
1
 
2.3%
ValueCountFrequency (%)
37
82.2%
8
 
17.8%
ValueCountFrequency (%)
$86
100.0%
ValueCountFrequency (%)
²7
100.0%
ValueCountFrequency (%)
181
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1398278
81.9%
Common305302
 
17.9%
Han2226
 
0.1%
Cyrillic191
 
< 0.1%
Inherited164
 
< 0.1%
Katakana136
 
< 0.1%
Hiragana70
 
< 0.1%
Hangul70
 
< 0.1%
Hebrew31
 
< 0.1%
Georgian13
 
< 0.1%

Most frequent character per script

ValueCountFrequency (%)
81
 
3.6%
46
 
2.1%
44
 
2.0%
41
 
1.8%
38
 
1.7%
36
 
1.6%
35
 
1.6%
35
 
1.6%
29
 
1.3%
29
 
1.3%
Other values (399)1812
81.4%
ValueCountFrequency (%)
237569
77.8%
,8742
 
2.9%
18327
 
2.7%
!7439
 
2.4%
-6374
 
2.1%
26019
 
2.0%
/4768
 
1.6%
.4151
 
1.4%
&3043
 
1.0%
32159
 
0.7%
Other values (118)16711
 
5.5%
ValueCountFrequency (%)
e117726
 
8.4%
o116911
 
8.4%
t100052
 
7.2%
a98692
 
7.1%
r93300
 
6.7%
i90377
 
6.5%
n90086
 
6.4%
l48969
 
3.5%
m47334
 
3.4%
s45537
 
3.3%
Other values (67)549294
39.3%
ValueCountFrequency (%)
7
 
10.0%
3
 
4.3%
3
 
4.3%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
Other values (38)43
61.4%
ValueCountFrequency (%)
а26
13.6%
о18
 
9.4%
т17
 
8.9%
н15
 
7.9%
е13
 
6.8%
к11
 
5.8%
р11
 
5.8%
м10
 
5.2%
с9
 
4.7%
в9
 
4.7%
Other values (23)52
27.2%
ValueCountFrequency (%)
14
 
10.3%
12
 
8.8%
10
 
7.4%
9
 
6.6%
9
 
6.6%
9
 
6.6%
8
 
5.9%
7
 
5.1%
6
 
4.4%
6
 
4.4%
Other values (22)46
33.8%
ValueCountFrequency (%)
16
22.9%
7
10.0%
7
10.0%
6
 
8.6%
5
 
7.1%
4
 
5.7%
4
 
5.7%
3
 
4.3%
2
 
2.9%
2
 
2.9%
Other values (13)14
20.0%
ValueCountFrequency (%)
י5
16.1%
ו5
16.1%
ב4
12.9%
ר4
12.9%
ע2
 
6.5%
ת2
 
6.5%
ה2
 
6.5%
ד1
 
3.2%
ש1
 
3.2%
ל1
 
3.2%
Other values (4)4
12.9%
ValueCountFrequency (%)
150
91.5%
14
 
8.5%
ValueCountFrequency (%)
13
100.0%
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1702145
99.7%
CJK2226
 
0.1%
Misc Symbols433
 
< 0.1%
None423
 
< 0.1%
Punctuation396
 
< 0.1%
Dingbats297
 
< 0.1%
Cyrillic191
 
< 0.1%
VS164
 
< 0.1%
Hiragana70
 
< 0.1%
Hangul70
 
< 0.1%
Other values (7)68
 
< 0.1%

Most frequent character per block

ValueCountFrequency (%)
237569
 
14.0%
e117726
 
6.9%
o116911
 
6.9%
t100052
 
5.9%
a98692
 
5.8%
r93300
 
5.5%
i90377
 
5.3%
n90086
 
5.3%
l48969
 
2.9%
m47334
 
2.8%
Other values (86)661129
38.8%
ValueCountFrequency (%)
188
47.5%
59
 
14.9%
41
 
10.4%
37
 
9.3%
37
 
9.3%
24
 
6.1%
8
 
2.0%
1
 
0.3%
1
 
0.3%
ValueCountFrequency (%)
34
 
8.0%
à27
 
6.4%
ó24
 
5.7%
21
 
5.0%
é15
 
3.5%
15
 
3.5%
14
 
3.3%
·13
 
3.1%
12
 
2.8%
11
 
2.6%
Other values (69)237
56.0%
ValueCountFrequency (%)
150
91.5%
14
 
8.5%
ValueCountFrequency (%)
212
49.0%
105
24.2%
37
 
8.5%
15
 
3.5%
11
 
2.5%
6
 
1.4%
6
 
1.4%
6
 
1.4%
6
 
1.4%
4
 
0.9%
Other values (10)25
 
5.8%
ValueCountFrequency (%)
155
52.2%
30
 
10.1%
25
 
8.4%
15
 
5.1%
14
 
4.7%
11
 
3.7%
8
 
2.7%
5
 
1.7%
5
 
1.7%
4
 
1.3%
Other values (11)25
 
8.4%
ValueCountFrequency (%)
1
100.0%
ValueCountFrequency (%)
81
 
3.6%
46
 
2.1%
44
 
2.0%
41
 
1.8%
38
 
1.7%
36
 
1.6%
35
 
1.6%
35
 
1.6%
29
 
1.3%
29
 
1.3%
Other values (399)1812
81.4%
ValueCountFrequency (%)
16
22.9%
7
10.0%
7
10.0%
6
 
8.6%
5
 
7.1%
4
 
5.7%
4
 
5.7%
3
 
4.3%
2
 
2.9%
2
 
2.9%
Other values (13)14
20.0%
ValueCountFrequency (%)
а26
13.6%
о18
 
9.4%
т17
 
8.9%
н15
 
7.9%
е13
 
6.8%
к11
 
5.8%
р11
 
5.8%
м10
 
5.2%
с9
 
4.7%
в9
 
4.7%
Other values (23)52
27.2%
ValueCountFrequency (%)
י5
16.1%
ו5
16.1%
ב4
12.9%
ר4
12.9%
ע2
 
6.5%
ת2
 
6.5%
ה2
 
6.5%
ד1
 
3.2%
ש1
 
3.2%
ל1
 
3.2%
Other values (4)4
12.9%
ValueCountFrequency (%)
13
100.0%
ValueCountFrequency (%)
7
 
10.0%
3
 
4.3%
3
 
4.3%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
Other values (38)43
61.4%
ValueCountFrequency (%)
4
36.4%
2
18.2%
2
18.2%
2
18.2%
1
 
9.1%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
4
57.1%
2
28.6%
1
 
14.3%
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

host_id
Real number (ℝ≥0)

Distinct: 35770
Distinct (%): 77.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66451005.18
Minimum: 2438
Maximum: 274321313
Zeros: 0
Zeros (%): 0.0%
Memory size: 362.8 KiB

Quantile statistics

Minimum: 2438
5-th percentile: 807362.4
Q1: 7719136.25
Median: 30321517.5
Q3: 105640471
95-th percentile: 238903945.7
Maximum: 274321313
Range: 274318875
Interquartile range (IQR): 97921334.75

Descriptive statistics

Standard deviation: 77691272.84
Coefficient of variation (CV): 1.169151206
Kurtosis: 0.2656304287
Mean: 66451005.18
Median Absolute Deviation (MAD): 27027079.5
Skewness: 1.235988162
Sum: 3.085187268 × 10^12
Variance: 6.035933876 × 10^15
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                   Count   Frequency (%)
219517861               272     0.6%
107434423               192     0.4%
137358866               103     0.2%
30283594                98      0.2%
12243051                95      0.2%
16098958                91      0.2%
61391963                91      0.2%
22541573                87      0.2%
1475015                 52      0.1%
7503643                 52      0.1%
Other values (35760)    45295   97.6%

Minimum 5 values

Value    Count   Frequency (%)
2438     1       < 0.1%
2571     1       < 0.1%
2787     6       < 0.1%
2845     2       < 0.1%
2868     1       < 0.1%

Maximum 5 values

Value        Count   Frequency (%)
274321313    1       < 0.1%
274311461    1       < 0.1%
274307600    1       < 0.1%
274298453    1       < 0.1%
274273284    1       < 0.1%

host_name
Categorical

HIGH CARDINALITY

Distinct: 11081
Distinct (%): 23.9%
Missing: 21
Missing (%): < 0.1%
Memory size: 362.8 KiB

Michael: 395
David: 375
John: 279
Sonder (NYC): 272
Alex: 260
Other values (11076): 44826

Length

Max length: 35
Median length: 6
Mean length: 6.109595535
Min length: 1

Characters and Unicode

Total characters: 283528
Distinct characters: 199
Distinct categories: 14
Distinct scripts: 7
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6646
Unique (%): 14.3%

Sample

1st row: John
2nd row: Jennifer
3rd row: Elisabeth
4th row: LisaRoxanne
5th row: Laura

Common values

Value                   Count   Frequency (%)
Michael                 395     0.9%
David                   375     0.8%
John                    279     0.6%
Sonder (NYC)            272     0.6%
Alex                    260     0.6%
Sarah                   221     0.5%
Daniel                  217     0.5%
Maria                   199     0.4%
Blueground              192     0.4%
Jessica                 188     0.4%
Other values (11071)    43809   94.4%
Histogram of lengths of the category
ValueCountFrequency (%)
1055
 
2.0%
and589
 
1.1%
michael438
 
0.8%
david416
 
0.8%
sonder367
 
0.7%
john319
 
0.6%
alex309
 
0.6%
laura284
 
0.5%
nyc282
 
0.5%
maria234
 
0.5%
Other values (9966)47391
91.7%

Most occurring characters

ValueCountFrequency (%)
a36141
 
12.7%
e27173
 
9.6%
i23124
 
8.2%
n22810
 
8.0%
r16949
 
6.0%
l14519
 
5.1%
o12101
 
4.3%
t8937
 
3.2%
s8673
 
3.1%
h8593
 
3.0%
Other values (189)104508
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter223752
78.9%
Uppercase Letter51840
 
18.3%
Space Separator5372
 
1.9%
Other Punctuation1509
 
0.5%
Open Punctuation324
 
0.1%
Close Punctuation322
 
0.1%
Dash Punctuation198
 
0.1%
Other Letter106
 
< 0.1%
Decimal Number69
 
< 0.1%
Math Symbol30
 
< 0.1%
Other values (4)6
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
6
 
5.7%
5
 
4.7%
5
 
4.7%
5
 
4.7%
4
 
3.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
2
 
1.9%
2
 
1.9%
Other values (58)68
64.2%
ValueCountFrequency (%)
a36141
16.2%
e27173
12.1%
i23124
10.3%
n22810
10.2%
r16949
 
7.6%
l14519
 
6.5%
o12101
 
5.4%
t8937
 
4.0%
s8673
 
3.9%
h8593
 
3.8%
Other values (54)44732
20.0%
ValueCountFrequency (%)
A6068
11.7%
J5146
 
9.9%
M5070
 
9.8%
S4477
 
8.6%
C3531
 
6.8%
L2760
 
5.3%
D2616
 
5.0%
K2511
 
4.8%
R2374
 
4.6%
E2270
 
4.4%
Other values (28)15017
29.0%
ValueCountFrequency (%)
516
23.2%
712
17.4%
011
15.9%
28
11.6%
47
10.1%
16
 
8.7%
33
 
4.3%
63
 
4.3%
82
 
2.9%
91
 
1.4%
ValueCountFrequency (%)
&1097
72.7%
.294
 
19.5%
/39
 
2.6%
,35
 
2.3%
'24
 
1.6%
@8
 
0.5%
"6
 
0.4%
!4
 
0.3%
:2
 
0.1%
ValueCountFrequency (%)
5366
99.9%
6
 
0.1%
ValueCountFrequency (%)
+30
100.0%
ValueCountFrequency (%)
(324
100.0%
ValueCountFrequency (%)
)322
100.0%
ValueCountFrequency (%)
-198
100.0%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
£1
100.0%
ValueCountFrequency (%)
_1
100.0%
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin275536
97.2%
Common7830
 
2.8%
Han89
 
< 0.1%
Cyrillic56
 
< 0.1%
Hangul9
 
< 0.1%
Hebrew5
 
< 0.1%
Hiragana3
 
< 0.1%

Most frequent character per script

ValueCountFrequency (%)
a36141
 
13.1%
e27173
 
9.9%
i23124
 
8.4%
n22810
 
8.3%
r16949
 
6.2%
l14519
 
5.3%
o12101
 
4.4%
t8937
 
3.2%
s8673
 
3.1%
h8593
 
3.1%
Other values (70)96516
35.0%
ValueCountFrequency (%)
6
 
6.7%
5
 
5.6%
5
 
5.6%
5
 
5.6%
4
 
4.5%
3
 
3.4%
3
 
3.4%
3
 
3.4%
2
 
2.2%
2
 
2.2%
Other values (43)51
57.3%
ValueCountFrequency (%)
5366
68.5%
&1097
 
14.0%
(324
 
4.1%
)322
 
4.1%
.294
 
3.8%
-198
 
2.5%
/39
 
0.5%
,35
 
0.4%
+30
 
0.4%
'24
 
0.3%
Other values (19)101
 
1.3%
ValueCountFrequency (%)
е6
10.7%
н6
10.7%
а6
10.7%
А4
 
7.1%
л4
 
7.1%
и4
 
7.1%
к3
 
5.4%
с3
 
5.4%
й3
 
5.4%
р3
 
5.4%
Other values (12)14
25.0%
ValueCountFrequency (%)
2
22.2%
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
ValueCountFrequency (%)
ד1
20.0%
נ1
20.0%
י1
20.0%
א1
20.0%
ל1
20.0%
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII283116
99.9%
None240
 
0.1%
CJK89
 
< 0.1%
Cyrillic56
 
< 0.1%
Punctuation10
 
< 0.1%
Hangul9
 
< 0.1%
Hebrew5
 
< 0.1%
Hiragana3
 
< 0.1%

Most frequent character per block

ValueCountFrequency (%)
a36141
 
12.8%
e27173
 
9.6%
i23124
 
8.2%
n22810
 
8.1%
r16949
 
6.0%
l14519
 
5.1%
o12101
 
4.3%
t8937
 
3.2%
s8673
 
3.1%
h8593
 
3.0%
Other values (67)104096
36.8%
ValueCountFrequency (%)
é104
43.3%
í23
 
9.6%
á20
 
8.3%
ú19
 
7.9%
ë13
 
5.4%
ô11
 
4.6%
ó9
 
3.8%
è7
 
2.9%
ç5
 
2.1%
ï4
 
1.7%
Other values (19)25
 
10.4%
ValueCountFrequency (%)
6
 
6.7%
5
 
5.6%
5
 
5.6%
5
 
5.6%
4
 
4.5%
3
 
3.4%
3
 
3.4%
3
 
3.4%
2
 
2.2%
2
 
2.2%
Other values (43)51
57.3%
ValueCountFrequency (%)
6
60.0%
2
 
20.0%
2
 
20.0%
ValueCountFrequency (%)
2
22.2%
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
ValueCountFrequency (%)
е6
10.7%
н6
10.7%
а6
10.7%
А4
 
7.1%
л4
 
7.1%
и4
 
7.1%
к3
 
5.4%
с3
 
5.4%
й3
 
5.4%
р3
 
5.4%
Other values (12)14
25.0%
ValueCountFrequency (%)
ד1
20.0%
נ1
20.0%
י1
20.0%
א1
20.0%
ל1
20.0%
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

neighbourhood_group
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 362.8 KiB

Manhattan: 19855
Brooklyn: 19550
Queens: 5586
Bronx: 1072
Staten Island: 365

Length

Max length: 13
Median length: 8
Mean length: 8.157060395
Min length: 5

Characters and Unicode

Total characters: 378716
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Brooklyn
2nd row: Manhattan
3rd row: Manhattan
4th row: Brooklyn
5th row: Manhattan

Common values

Value            Count   Frequency (%)
Manhattan        19855   42.8%
Brooklyn         19550   42.1%
Queens           5586    12.0%
Bronx            1072    2.3%
Staten Island    365     0.8%

Histogram of lengths of the category
ValueCountFrequency (%)
manhattan19855
42.4%
brooklyn19550
41.8%
queens5586
 
11.9%
bronx1072
 
2.3%
island365
 
0.8%
staten365
 
0.8%

Most occurring characters

ValueCountFrequency (%)
n66648
17.6%
a60295
15.9%
t40440
10.7%
o40172
10.6%
B20622
 
5.4%
r20622
 
5.4%
l19915
 
5.3%
M19855
 
5.2%
h19855
 
5.2%
k19550
 
5.2%
Other values (10)50742
13.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter331558
87.5%
Uppercase Letter46793
 
12.4%
Space Separator365
 
0.1%

Most frequent character per category

ValueCountFrequency (%)
n66648
20.1%
a60295
18.2%
t40440
12.2%
o40172
12.1%
r20622
 
6.2%
l19915
 
6.0%
h19855
 
6.0%
k19550
 
5.9%
y19550
 
5.9%
e11537
 
3.5%
Other values (4)12974
 
3.9%
ValueCountFrequency (%)
B20622
44.1%
M19855
42.4%
Q5586
 
11.9%
S365
 
0.8%
I365
 
0.8%
ValueCountFrequency (%)
365
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin378351
99.9%
Common365
 
0.1%

Most frequent character per script

ValueCountFrequency (%)
n66648
17.6%
a60295
15.9%
t40440
10.7%
o40172
10.6%
B20622
 
5.5%
r20622
 
5.5%
l19915
 
5.3%
M19855
 
5.2%
h19855
 
5.2%
k19550
 
5.2%
Other values (9)50377
13.3%
ValueCountFrequency (%)
365
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII378716
100.0%

Most frequent character per block

ValueCountFrequency (%)
n66648
17.6%
a60295
15.9%
t40440
10.7%
o40172
10.6%
B20622
 
5.4%
r20622
 
5.4%
l19915
 
5.3%
M19855
 
5.2%
h19855
 
5.2%
k19550
 
5.2%
Other values (10)50742
13.4%

neighbourhood
Categorical

HIGH CARDINALITY

Distinct: 219
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 362.8 KiB

Williamsburg: 3771
Bedford-Stuyvesant: 3647
Harlem: 2599
Bushwick: 2442
Upper West Side: 1814
Other values (214): 32155

Length

Max length: 26
Median length: 12
Mean length: 11.92599294
Min length: 4

Characters and Unicode

Total characters: 553700
Distinct characters: 54
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: Kensington
2nd row: Midtown
3rd row: Harlem
4th row: Clinton Hill
5th row: East Harlem

Common values

Value                 Count   Frequency (%)
Williamsburg          3771    8.1%
Bedford-Stuyvesant    3647    7.9%
Harlem                2599    5.6%
Bushwick              2442    5.3%
Upper West Side       1814    3.9%
Hell's Kitchen        1769    3.8%
East Village          1737    3.7%
Upper East Side       1692    3.6%
Crown Heights         1528    3.3%
Midtown               1211    2.6%
Other values (209)    24218   52.2%
Histogram of lengths of the category
ValueCountFrequency (%)
east6298
 
8.4%
side4376
 
5.8%
williamsburg3771
 
5.0%
harlem3693
 
4.9%
bedford-stuyvesant3647
 
4.9%
upper3506
 
4.7%
heights3504
 
4.7%
village2905
 
3.9%
west2502
 
3.3%
bushwick2442
 
3.3%
Other values (231)38317
51.1%

Most occurring characters

ValueCountFrequency (%)
e50730
 
9.2%
i39813
 
7.2%
s37999
 
6.9%
t36585
 
6.6%
a35885
 
6.5%
l32481
 
5.9%
r32219
 
5.8%
28533
 
5.2%
n24857
 
4.5%
o22938
 
4.1%
Other values (44)211660
38.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter439490
79.4%
Uppercase Letter79604
 
14.4%
Space Separator28533
 
5.2%
Dash Punctuation4172
 
0.8%
Other Punctuation1901
 
0.3%

Most frequent character per category

ValueCountFrequency (%)
e50730
11.5%
i39813
 
9.1%
s37999
 
8.6%
t36585
 
8.3%
a35885
 
8.2%
l32481
 
7.4%
r32219
 
7.3%
n24857
 
5.7%
o22938
 
5.2%
d18794
 
4.3%
Other values (15)107189
24.4%
ValueCountFrequency (%)
H11337
14.2%
S10953
13.8%
B8190
10.3%
W7758
9.7%
E6784
8.5%
C5040
 
6.3%
U3567
 
4.5%
G3554
 
4.5%
F3112
 
3.9%
V2947
 
3.7%
Other values (14)16362
20.6%
ValueCountFrequency (%)
'1778
93.5%
.121
 
6.4%
,2
 
0.1%
ValueCountFrequency (%)
28533
100.0%
ValueCountFrequency (%)
-4172
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin519094
93.8%
Common34606
 
6.2%

Most frequent character per script

ValueCountFrequency (%)
e50730
 
9.8%
i39813
 
7.7%
s37999
 
7.3%
t36585
 
7.0%
a35885
 
6.9%
l32481
 
6.3%
r32219
 
6.2%
n24857
 
4.8%
o22938
 
4.4%
d18794
 
3.6%
Other values (39)186793
36.0%
ValueCountFrequency (%)
28533
82.5%
-4172
 
12.1%
'1778
 
5.1%
.121
 
0.3%
,2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII553700
100.0%

Most frequent character per block

ValueCountFrequency (%)
e50730
 
9.2%
i39813
 
7.2%
s37999
 
6.9%
t36585
 
6.6%
a35885
 
6.5%
l32481
 
5.9%
r32219
 
5.8%
28533
 
5.2%
n24857
 
4.5%
o22938
 
4.1%
Other values (44)211660
38.2%

latitude
Real number (ℝ≥0)

Distinct: 18791
Distinct (%): 40.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.72857167
Minimum: 40.49979
Maximum: 40.91306
Zeros: 0
Zeros (%): 0.0%
Memory size: 362.8 KiB

Quantile statistics

Minimum: 40.49979
5-th percentile: 40.64538
Q1: 40.68936
Median: 40.72201
Q3: 40.76333
95-th percentile: 40.826463
Maximum: 40.91306
Range: 0.41327
Interquartile range (IQR): 0.07397

Descriptive statistics

Standard deviation: 0.05519047241
Coefficient of variation (CV): 0.001355079988
Kurtosis: 0.09972499142
Mean: 40.72857167
Median Absolute Deviation (MAD): 0.03641
Skewness: 0.2583660389
Sum: 1890946.126
Variance: 0.003045988245
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                   Count   Frequency (%)
40.71813                18      < 0.1%
40.69414                13      < 0.1%
40.68444                13      < 0.1%
40.68634                13      < 0.1%
40.71353                12      < 0.1%
40.71171                12      < 0.1%
40.68537                12      < 0.1%
40.76189                11      < 0.1%
40.71923                11      < 0.1%
40.76125                11      < 0.1%
Other values (18781)    46302   99.7%

Minimum 5 values

Value       Count   Frequency (%)
40.49979    1       < 0.1%
40.50641    1       < 0.1%
40.50708    1       < 0.1%
40.50868    1       < 0.1%
40.50873    1       < 0.1%

Maximum 5 values

Value       Count   Frequency (%)
40.91306    1       < 0.1%
40.91234    1       < 0.1%
40.91169    1       < 0.1%
40.91167    1       < 0.1%
40.90804    1       < 0.1%

longitude
Real number (ℝ)

Distinct: 14563
Distinct (%): 31.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.950968
Minimum: -74.24442
Maximum: -73.71299
Zeros: 0
Zeros (%): 0.0%
Memory size: 362.8 KiB

Quantile statistics

Minimum: -74.24442
5-th percentile: -74.00326
Q1: -73.9821
Median: -73.95457
Q3: -73.9346275
95-th percentile: -73.86389
Maximum: -73.71299
Range: 0.53143
Interquartile range (IQR): 0.0474725

Descriptive statistics

Standard deviation: 0.04638583239
Coefficient of variation (CV): -0.0006272511861
Kurtosis: 4.939479054
Mean: -73.950968
Median Absolute Deviation (MAD): 0.02488
Skewness: 1.249716322
Sum: -3433395.542
Variance: 0.002151645447
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                   Count   Frequency (%)
-73.95677               18      < 0.1%
-73.95427               17      < 0.1%
-73.9506                16      < 0.1%
-73.94791               16      < 0.1%
-73.95136               16      < 0.1%
-73.95405               16      < 0.1%
-73.95332               16      < 0.1%
-73.95725               15      < 0.1%
-73.98439               15      < 0.1%
-73.94537               15      < 0.1%
Other values (14553)    46268   99.7%

Minimum 5 values

Value        Count   Frequency (%)
-74.24442    1       < 0.1%
-74.24285    1       < 0.1%
-74.24084    1       < 0.1%
-74.23986    1       < 0.1%
-74.23914    1       < 0.1%

Maximum 5 values

Value        Count   Frequency (%)
-73.71299    1       < 0.1%
-73.7169     1       < 0.1%
-73.71795    1       < 0.1%
-73.71829    1       < 0.1%
-73.71928    1       < 0.1%

room_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 362.8 KiB

Entire home/apt: 23252
Private room: 22036
Shared room: 1140

Length

Max length: 15
Median length: 15
Mean length: 13.47790127
Min length: 11

Characters and Unicode

Total characters: 625752
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Private room
2nd row: Entire home/apt
3rd row: Private room
4th row: Entire home/apt
5th row: Entire home/apt

Common values

Value              Count   Frequency (%)
Entire home/apt    23252   50.1%
Private room       22036   47.5%
Shared room        1140    2.5%

Histogram of lengths of the category
ValueCountFrequency (%)
home/apt23252
25.0%
entire23252
25.0%
room23176
25.0%
private22036
23.7%
shared1140
 
1.2%

Most occurring characters

ValueCountFrequency (%)
e69680
11.1%
r69604
11.1%
o69604
11.1%
t68540
11.0%
a46428
 
7.4%
46428
 
7.4%
m46428
 
7.4%
i45288
 
7.2%
h24392
 
3.9%
E23252
 
3.7%
Other values (7)116108
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter509644
81.4%
Uppercase Letter46428
 
7.4%
Space Separator46428
 
7.4%
Other Punctuation23252
 
3.7%

Most frequent character per category

ValueCountFrequency (%)
e69680
13.7%
r69604
13.7%
o69604
13.7%
t68540
13.4%
a46428
9.1%
m46428
9.1%
i45288
8.9%
h24392
 
4.8%
n23252
 
4.6%
p23252
 
4.6%
Other values (2)23176
 
4.5%
ValueCountFrequency (%)
E23252
50.1%
P22036
47.5%
S1140
 
2.5%
ValueCountFrequency (%)
46428
100.0%
ValueCountFrequency (%)
/23252
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin556072
88.9%
Common69680
 
11.1%

Most frequent character per script

ValueCountFrequency (%)
e69680
12.5%
r69604
12.5%
o69604
12.5%
t68540
12.3%
a46428
8.3%
m46428
8.3%
i45288
8.1%
h24392
 
4.4%
E23252
 
4.2%
n23252
 
4.2%
Other values (5)69604
12.5%
ValueCountFrequency (%)
46428
66.6%
/23252
33.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII625752
100.0%

Most frequent character per block

ValueCountFrequency (%)
e69680
11.1%
r69604
11.1%
o69604
11.1%
t68540
11.0%
a46428
 
7.4%
46428
 
7.4%
m46428
 
7.4%
i45288
 
7.2%
h24392
 
3.9%
E23252
 
3.7%
Other values (7)116108
18.6%

price
Real number (ℝ≥0)

Distinct: 337
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122.5380159
Minimum: 10
Maximum: 350
Zeros: 0
Zeros (%): 0.0%
Memory size: 362.8 KiB

Quantile statistics

Minimum: 10
5-th percentile: 40
Q1: 65
Median: 100
Q3: 160
95-th percentile: 275
Maximum: 350
Range: 340
Interquartile range (IQR): 95

Descriptive statistics

Standard deviation: 71.86258104
Coefficient of variation (CV): 0.5864513191
Kurtosis: 0.5184784007
Mean: 122.5380159
Median Absolute Deviation (MAD): 44
Skewness: 1.030664416
Sum: 5689195
Variance: 5164.230553
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
100                   2051    4.4%
150                   2047    4.4%
50                    1534    3.3%
60                    1458    3.1%
200                   1401    3.0%
75                    1370    3.0%
80                    1272    2.7%
65                    1190    2.6%
70                    1170    2.5%
120                   1130    2.4%
Other values (327)    31805   68.5%

Minimum 5 values

Value    Count   Frequency (%)
10       17      < 0.1%
11       3       < 0.1%
12       4       < 0.1%
13       1       < 0.1%
15       6       < 0.1%

Maximum 5 values

Value    Count   Frequency (%)
350      381     0.8%
349      45      0.1%
348      3       < 0.1%
347      4       < 0.1%
346      2       < 0.1%

minimum_nights
Real number (ℝ≥0)

SKEWED

Distinct: 107
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.943180839
Minimum: 1
Maximum: 1250
Zeros: 0
Zeros (%): 0.0%
Memory size: 362.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 5
95-th percentile: 30
Maximum: 1250
Range: 1249
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 19.8775096
Coefficient of variation (CV): 2.86288231
Kurtosis: 873.9314378
Mean: 6.943180839
Median Absolute Deviation (MAD): 1
Skewness: 21.79076237
Sum: 322358
Variance: 395.1153878
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
1                    12148   26.2%
2                    11199   24.1%
3                    7506    16.2%
30                   3534    7.6%
4                    3106    6.7%
5                    2854    6.1%
7                    1975    4.3%
6                    694     1.5%
14                   543     1.2%
10                   464     1.0%
Other values (97)    2405    5.2%

Minimum 5 values

Value    Count   Frequency (%)
1        12148   26.2%
2        11199   24.1%
3        7506    16.2%
4        3106    6.7%
5        2854    6.1%

Maximum 5 values

Value    Count   Frequency (%)
1250     1       < 0.1%
999      3       < 0.1%
500      5       < 0.1%
480      1       < 0.1%
400      1       < 0.1%

number_of_reviews
Real number (ℝ≥0)

ZEROS

Distinct: 393
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.82771173
Minimum: 0
Maximum: 629
Zeros: 9182
Zeros (%): 19.8%
Memory size: 362.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 5
Q3: 24
95-th percentile: 116
Maximum: 629
Range: 629
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 45.190521
Coefficient of variation (CV): 1.896553119
Kurtosis: 18.94434683
Mean: 23.82771173
Median Absolute Deviation (MAD): 5
Skewness: 3.640349972
Sum: 1106273
Variance: 2042.183188
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
0                     9182    19.8%
1                     4976    10.7%
2                     3318    7.1%
3                     2390    5.1%
4                     1918    4.1%
5                     1526    3.3%
6                     1297    2.8%
7                     1130    2.4%
8                     1080    2.3%
9                     924     2.0%
Other values (383)    18687   40.2%

Minimum 5 values

Value    Count   Frequency (%)
0        9182    19.8%
1        4976    10.7%
2        3318    7.1%
3        2390    5.1%
4        1918    4.1%

Maximum 5 values

Value    Count   Frequency (%)
629      1       < 0.1%
607      1       < 0.1%
597      1       < 0.1%
594      1       < 0.1%
576      1       < 0.1%

last_review
Date

MISSING

Distinct: 1754
Distinct (%): 4.7%
Missing: 9182
Missing (%): 19.8%
Memory size: 362.8 KiB
Minimum: 2011-03-28 00:00:00
Maximum: 2019-07-08 00:00:00
Histogram with fixed size bins (bins=50)

reviews_per_month
Real number (ℝ≥0)

MISSING

Distinct: 936
Distinct (%): 2.5%
Missing: 9182
Missing (%): 19.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.377473017
Minimum: 0.01
Maximum: 58.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 362.8 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.04
Q1: 0.19
median: 0.715
Q3: 2.02
95-th percentile: 4.67
Maximum: 58.5
Range: 58.49
Interquartile range (IQR): 1.83

Descriptive statistics

Standard deviation: 1.690493397
Coefficient of variation (CV): 1.227242476
Kurtosis: 43.0992735
Mean: 1.377473017
Median Absolute Deviation (MAD): 0.615
Skewness: 3.154919504
Sum: 51305.36
Variance: 2.857767925
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.02 | 886 | 1.9%
0.05 | 856 | 1.8%
1 | 828 | 1.8%
0.03 | 772 | 1.7%
0.04 | 639 | 1.4%
0.16 | 638 | 1.4%
0.08 | 580 | 1.2%
0.09 | 564 | 1.2%
0.06 | 557 | 1.2%
0.11 | 527 | 1.1%
Other values (926) | 30399 | 65.5%
(Missing) | 9182 | 19.8%
Minimum 5 values
Value | Count | Frequency (%)
0.01 | 40 | 0.1%
0.02 | 886 | 1.9%
0.03 | 772 | 1.7%
0.04 | 639 | 1.4%
0.05 | 856 | 1.8%
Maximum 5 values
Value | Count | Frequency (%)
58.5 | 1 | < 0.1%
27.95 | 1 | < 0.1%
20.94 | 1 | < 0.1%
19.75 | 1 | < 0.1%
17.82 | 1 | < 0.1%

calculated_host_listings_count
Real number (ℝ≥0)

Distinct: 47
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.672503662
Minimum: 1
Maximum: 327
Zeros: 0
Zeros (%): 0.0%
Memory size: 362.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 2
95-th percentile: 13
Maximum: 327
Range: 326
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 31.0834363
Coefficient of variation (CV): 4.658436754
Kurtosis: 75.60395988
Mean: 6.672503662
Median Absolute Deviation (MAD): 0
Skewness: 8.350507449
Sum: 309791
Variance: 966.180012
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Common values
Value | Count | Frequency (%)
1 | 30677 | 66.1%
2 | 6436 | 13.9%
3 | 2745 | 5.9%
4 | 1354 | 2.9%
5 | 808 | 1.7%
6 | 529 | 1.1%
8 | 396 | 0.9%
7 | 390 | 0.8%
327 | 272 | 0.6%
9 | 225 | 0.5%
Other values (37) | 2596 | 5.6%
Minimum 5 values
Value | Count | Frequency (%)
1 | 30677 | 66.1%
2 | 6436 | 13.9%
3 | 2745 | 5.9%
4 | 1354 | 2.9%
5 | 808 | 1.7%
Maximum 5 values
Value | Count | Frequency (%)
327 | 272 | 0.6%
232 | 192 | 0.4%
121 | 98 | 0.2%
103 | 103 | 0.2%
96 | 186 | 0.4%

availability_365
Real number (ℝ≥0)

ZEROS

Distinct: 366
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 109.6768545
Minimum: 0
Maximum: 365
Zeros: 17005
Zeros (%): 36.6%
Memory size: 362.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 40
Q3: 217
95-th percentile: 358
Maximum: 365
Range: 365
Interquartile range (IQR): 217

Descriptive statistics

Standard deviation: 130.4139522
Coefficient of variation (CV): 1.189074512
Kurtosis: -0.9227839215
Mean: 109.6768545
Median Absolute Deviation (MAD): 40
Skewness: 0.8064281257
Sum: 5092077
Variance: 17007.79893
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 17005 | 36.6%
365 | 1122 | 2.4%
364 | 430 | 0.9%
1 | 397 | 0.9%
89 | 334 | 0.7%
5 | 333 | 0.7%
3 | 296 | 0.6%
179 | 273 | 0.6%
90 | 270 | 0.6%
2 | 254 | 0.5%
Other values (356) | 25714 | 55.4%
Minimum 5 values
Value | Count | Frequency (%)
0 | 17005 | 36.6%
1 | 397 | 0.9%
2 | 254 | 0.5%
3 | 296 | 0.6%
4 | 227 | 0.5%
Maximum 5 values
Value | Count | Frequency (%)
365 | 1122 | 2.4%
364 | 430 | 0.9%
363 | 215 | 0.5%
362 | 150 | 0.3%
361 | 101 | 0.2%

Interactions

[Pairwise interaction plots of the numeric variables; images not preserved.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that, for a linear relationship, the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
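
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the report's own implementation; the function name `pearson_r` and the sample values are assumptions made for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations (the 1/n factors cancel in r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

# Illustrative values only; they are not taken from the profiled dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

r = pearson_r(x, y)
# Shifting x and rescaling it by a positive factor leaves r unchanged.
r_rescaled = pearson_r([3 * v + 10 for v in x], y)
print(r, r_rescaled)
```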

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
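
As a rough illustration of that recipe, the sketch below ranks each variable (ties share the mean of their positions) and then applies the Pearson formula to the ranks. The helper names and the sample data are assumptions for the example, not part of the report.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Illustrative data: y increases monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman_rho(x, y))  # close to 1, because the two rankings agree
print(pearson_r(x, y))     # below 1, because the relationship is not linear
```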

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
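
The pair-counting definition can be written out directly. The sketch below follows the description literally and makes no adjustment for ties, so it may differ from what a statistics library reports on data with many ties. The function name and sample values are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y; it counts toward neither total.
    return (concordant - discordant) / len(pairs)

# Illustrative values: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau(x, y))  # (5 - 1) / 6
```

The pairwise loop is quadratic in the number of observations, which is fine for a small example but slow on a dataset of this size.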

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013, which can be found here.
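
For orientation, here is a minimal sketch of a bias-corrected Cramér's V. It assumes the correction commonly attributed to Bergsma (2013), in which the chi-squared-based term and the table dimensions are each adjusted by the sample size. The function name and the toy data are illustrative, and this is not necessarily how the report computes the value.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Assumed correction terms: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # a single category on either side carries no association
    return math.sqrt(phi2_corr / denom)

# Toy data: perfectly associated labels, then independent labels.
perfect = cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(perfect, independent)
```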

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
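
Two of the quantities behind these displays, the share of missing values per column and the nullity correlation between two columns, can be sketched as follows. The rows are a small subset of the sample rows in this report, with NaT/NaN written as `None`. The helper names are hypothetical and this is not the plotting code used to produce the report.

```python
import math

# Three columns from a few sample rows of this report (NaT/NaN shown as None).
rows = [
    {"number_of_reviews": 9, "last_review": "2018-10-19", "reviews_per_month": 0.21},
    {"number_of_reviews": 45, "last_review": "2019-05-21", "reviews_per_month": 0.38},
    {"number_of_reviews": 0, "last_review": None, "reviews_per_month": None},
    {"number_of_reviews": 270, "last_review": "2019-07-05", "reviews_per_month": 4.64},
    {"number_of_reviews": 0, "last_review": None, "reviews_per_month": None},
]

def missing_fraction(rows, column):
    """Share of rows in which the column is missing."""
    return sum(row[column] is None for row in rows) / len(rows)

def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if row[col_a] is None else 0.0 for row in rows]
    b = [1.0 if row[col_b] is None else 0.0 for row in rows]
    n = len(rows)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n
    var_a = sum((p - mean_a) ** 2 for p in a) / n
    var_b = sum((q - mean_b) ** 2 for q in b) / n
    if var_a == 0 or var_b == 0:
        return math.nan  # a column that is never (or always) missing has no variance
    return cov / math.sqrt(var_a * var_b)

print(missing_fraction(rows, "last_review"))
print(nullity_correlation(rows, "last_review", "reviews_per_month"))
```

In this subset last_review and reviews_per_month are missing in exactly the same rows, which matches the identical missing counts (9182) reported for both variables. With pandas, `df.isna().mean()` and `df.isna().corr()` give the same two quantities for a whole DataFrame.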

Sample

First rows

(index) | df_index | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365
0 | 0 | 2539 | Clean & quiet apt home by the park | 2787 | John | Brooklyn | Kensington | 40.64749 | -73.97237 | Private room | 149 | 1 | 9 | 2018-10-19 | 0.21 | 6 | 365
1 | 1 | 2595 | Skylit Midtown Castle | 2845 | Jennifer | Manhattan | Midtown | 40.75362 | -73.98377 | Entire home/apt | 225 | 1 | 45 | 2019-05-21 | 0.38 | 2 | 355
2 | 2 | 3647 | THE VILLAGE OF HARLEM....NEW YORK ! | 4632 | Elisabeth | Manhattan | Harlem | 40.80902 | -73.94190 | Private room | 150 | 3 | 0 | NaT | NaN | 1 | 365
3 | 3 | 3831 | Cozy Entire Floor of Brownstone | 4869 | LisaRoxanne | Brooklyn | Clinton Hill | 40.68514 | -73.95976 | Entire home/apt | 89 | 1 | 270 | 2019-07-05 | 4.64 | 1 | 194
4 | 4 | 5022 | Entire Apt: Spacious Studio/Loft by central park | 7192 | Laura | Manhattan | East Harlem | 40.79851 | -73.94399 | Entire home/apt | 80 | 10 | 9 | 2018-11-19 | 0.10 | 1 | 0
5 | 5 | 5099 | Large Cozy 1 BR Apartment In Midtown East | 7322 | Chris | Manhattan | Murray Hill | 40.74767 | -73.97500 | Entire home/apt | 200 | 3 | 74 | 2019-06-22 | 0.59 | 1 | 129
6 | 6 | 5121 | BlissArtsSpace! | 7356 | Garon | Brooklyn | Bedford-Stuyvesant | 40.68688 | -73.95596 | Private room | 60 | 45 | 49 | 2017-10-05 | 0.40 | 1 | 0
7 | 7 | 5178 | Large Furnished Room Near B'way | 8967 | Shunichi | Manhattan | Hell's Kitchen | 40.76489 | -73.98493 | Private room | 79 | 2 | 430 | 2019-06-24 | 3.47 | 1 | 220
8 | 8 | 5203 | Cozy Clean Guest Room - Family Apt | 7490 | MaryEllen | Manhattan | Upper West Side | 40.80178 | -73.96723 | Private room | 79 | 2 | 118 | 2017-07-21 | 0.99 | 1 | 0
9 | 9 | 5238 | Cute & Cozy Lower East Side 1 bdrm | 7549 | Ben | Manhattan | Chinatown | 40.71344 | -73.99037 | Entire home/apt | 150 | 1 | 160 | 2019-06-09 | 1.33 | 4 | 188

Last rows

(index) | df_index | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365
46418 | 48885 | 36482809 | Stunning Bedroom NYC! Walking to Central Park!! | 131529729 | Kendall | Manhattan | East Harlem | 40.79633 | -73.93605 | Private room | 75 | 2 | 0 | NaT | NaN | 2 | 353
46419 | 48886 | 36483010 | Comfy 1 Bedroom in Midtown East | 274311461 | Scott | Manhattan | Midtown | 40.75561 | -73.96723 | Entire home/apt | 200 | 6 | 0 | NaT | NaN | 1 | 176
46420 | 48887 | 36483152 | Garden Jewel Apartment in Williamsburg New York | 208514239 | Melki | Brooklyn | Williamsburg | 40.71232 | -73.94220 | Entire home/apt | 170 | 1 | 0 | NaT | NaN | 3 | 365
46421 | 48888 | 36484087 | Spacious Room w/ Private Rooftop, Central location | 274321313 | Kat | Manhattan | Hell's Kitchen | 40.76392 | -73.99183 | Private room | 125 | 4 | 0 | NaT | NaN | 1 | 31
46422 | 48889 | 36484363 | QUIT PRIVATE HOUSE | 107716952 | Michael | Queens | Jamaica | 40.69137 | -73.80844 | Private room | 65 | 1 | 0 | NaT | NaN | 2 | 163
46423 | 48890 | 36484665 | Charming one bedroom - newly renovated rowhouse | 8232441 | Sabrina | Brooklyn | Bedford-Stuyvesant | 40.67853 | -73.94995 | Private room | 70 | 2 | 0 | NaT | NaN | 2 | 9
46424 | 48891 | 36485057 | Affordable room in Bushwick/East Williamsburg | 6570630 | Marisol | Brooklyn | Bushwick | 40.70184 | -73.93317 | Private room | 40 | 4 | 0 | NaT | NaN | 2 | 36
46425 | 48892 | 36485431 | Sunny Studio at Historical Neighborhood | 23492952 | Ilgar & Aysel | Manhattan | Harlem | 40.81475 | -73.94867 | Entire home/apt | 115 | 10 | 0 | NaT | NaN | 1 | 27
46426 | 48893 | 36485609 | 43rd St. Time Square-cozy single bed | 30985759 | Taz | Manhattan | Hell's Kitchen | 40.75751 | -73.99112 | Shared room | 55 | 1 | 0 | NaT | NaN | 6 | 2
46427 | 48894 | 36487245 | Trendy duplex in the very heart of Hell's Kitchen | 68119814 | Christophe | Manhattan | Hell's Kitchen | 40.76404 | -73.98933 | Private room | 90 | 7 | 0 | NaT | NaN | 1 | 23